Unsupervised sequential organization for cochannel speech separation

Authors

  • Ke Hu
  • DeLiang Wang
Abstract

The problem of sequential organization in the cochannel speech situation has previously been studied using speaker-model based methods. A major limitation of these methods is that they require the availability of pretrained speaker models and prior knowledge (or detection) of participating speakers. We propose an unsupervised clustering approach to cochannel speech sequential organization. Given enhanced cepstral features, we search for the optimal assignment of simultaneous speech streams by maximizing the between- and within-cluster scatter matrix ratio penalized by concurrent pitches within individual speakers. A genetic algorithm is employed to speed up the search. Our method does not require trained speaker models, and experiments with both ideal and estimated simultaneous streams show the proposed method outperforms a speaker-model based method in both speech segregation and computational efficiency.
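
The objective described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the trace form of the scatter ratio, the overlap-based penalty, the `weight` parameter, and the function names `scatter_ratio`, `overlap_penalty`, and `best_assignment` are all assumptions made for illustration. The abstract states that a genetic algorithm drives the search; the sketch swaps in an exhaustive search over binary assignments, which is only practical for a handful of streams.

```python
import itertools

import numpy as np


def scatter_ratio(features, labels):
    """Return trace(S_W^-1 S_B) for a two-cluster labeling (assumed form)."""
    overall_mean = features.mean(axis=0)
    dim = features.shape[1]
    s_w = np.zeros((dim, dim))  # within-cluster scatter
    s_b = np.zeros((dim, dim))  # between-cluster scatter
    for k in (0, 1):
        members = features[labels == k]
        if len(members) == 0:
            return -np.inf  # an empty cluster is not a valid two-speaker split
        mean_k = members.mean(axis=0)
        centered = members - mean_k
        s_w += centered.T @ centered
        diff = (mean_k - overall_mean)[:, None]
        s_b += len(members) * (diff @ diff.T)
    s_w += 1e-6 * np.eye(dim)  # small ridge so S_W stays invertible
    return float(np.trace(np.linalg.solve(s_w, s_b)))


def overlap_penalty(spans, assignment):
    """Count overlapping frames between streams given to the same speaker.

    Hypothetical stand-in for the concurrent-pitch penalty: one speaker
    cannot produce two pitched streams at the same time.
    """
    penalty = 0
    for i, j in itertools.combinations(range(len(spans)), 2):
        if assignment[i] == assignment[j]:
            start = max(spans[i][0], spans[j][0])
            end = min(spans[i][1], spans[j][1])
            penalty += max(0, end - start)
    return penalty


def best_assignment(stream_features, spans, weight=1.0):
    """Exhaustively search stream-to-speaker assignments (not a GA)."""
    frames = np.vstack(stream_features)
    best, best_score = None, -np.inf
    for bits in itertools.product((0, 1), repeat=len(stream_features) - 1):
        assignment = (0,) + bits  # fix stream 0 to remove label symmetry
        labels = np.concatenate(
            [np.full(len(f), a) for f, a in zip(stream_features, assignment)]
        )
        score = scatter_ratio(frames, labels)
        score -= weight * overlap_penalty(spans, assignment)
        if score > best_score:
            best, best_score = assignment, score
    return best, best_score
```

Here each entry of `stream_features` is assumed to be a frames-by-dimensions array of cepstral features for one simultaneous stream, and each entry of `spans` is a `(start_frame, end_frame)` pair. The search space doubles with every added stream, which is presumably why the authors turn to a genetic algorithm.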

Similar resources

Cochannel speech separation using multi-pitch estimation and model based voiced sequential grouping

In this paper, a new cochannel speech separation algorithm using multi-pitch extraction and speaker-model based sequential grouping is proposed. After auditory segmentation based on onset and offset analysis, a robust multi-pitch estimation algorithm is applied to each segment and the corresponding voiced portions are segregated. Then a speaker pair model based on a support vector machine (SVM) is ...

Model-based sequential organization for cochannel speaker identification

It is difficult to directly apply traditional speaker identification (SID) systems to cochannel speech, that is, mixtures of two speakers. Previous work demonstrates that extraction of usable speech segments significantly improves SID performance if speaker assignment, or sequential organization of the segments, is known. We derive a joint computational objective for speaker assignment and cochannel S...

Incremental largest margin linear regression and MAP adaptation for speech separation in telemedicine applications

In this paper, a novel technique of online incremental speaker adaptation for speech stream separation in telemedicine is proposed. An unsupervised discriminative linear regression technique is developed, based on the principle of maximizing the class separation margin, to transform the model mean. This adaptation approach is called largest margin linear regression (LMLR). Online incremental LMLR and...

An iterative model-based approach to cochannel speech separation

Cochannel speech separation aims to separate two speech signals from a single mixture. In a supervised scenario, the identities of two speakers are given, and current methods use pre-trained speaker models for separation. One issue in model-based methods is the mismatch between training and test signal levels. We propose an iterative algorithm to adapt speaker models to match the signal levels ...

Speech Segregation based on Binary Classification

Speech segregation is a fundamental challenge in speech and audio processing. This AFOSR project aimed to develop a speech segregation system that can potentially improve speech intelligibility in noise for human listeners. Motivated by the perceptual principles of auditory scene analysis and the speech intelligibility studies of ideal time-frequency masking, the project sought to develop a cla...

Publication date: 2010